Construction and validation of a predictive model of mortality of tuberculosis-destroyed lung patients requiring mechanical ventilation: A single-center retrospective cohort study

The mortality rate for intensive care unit tuberculosis-destroyed lung (TDL) patients requiring mechanical ventilation (MV) remains high. We conducted a retrospective analysis of adult TDL patients requiring MV who were admitted to the intensive care unit of a tertiary infectious disease hospital in Chengdu, Sichuan Province, China from January 2019 to March 2023. Univariate and multivariate COX regression analyses were conducted to determine independent patient prognostic risk factors that were used to construct a predictive model of patient mortality. A total of 331 patients were included, the median age was 63.0 (50.0–71.0) years, 262 (79.2%) were males and the mortality rate was 48.64% (161/331). Training and validation data sets were obtained from 245 and 86 patients, respectively. Analysis of the training data set revealed that body mass index <18.5 kg/m2, blood urea nitrogen ≥7.14 mmol/L and septic shock were independent risk factors for increased mortality of TDL patients requiring MV. These variables were then used to construct a risk-based model for predicting patient mortality. Area under curve, sensitivity, and specificity values obtained using the model for the training data set were 0.808, 79.17%, and 68.80%, respectively, and corresponding values obtained using the validation data set were 0.876, 95.12%, and 62.22%, respectively. Concurrent correction curve and decision curve analyses confirmed the high predictive ability of the model, indicating its potential to facilitate early identification and classification-based clinical management of high-risk TDL patients requiring MV.


Introduction
Tuberculosis (TB) is a chronic infectious disease caused by Mycobacterium tuberculosis.In 2021, about 10.6 million new TB cases and approximately 1.5 million TB-related deaths, of which 90% were caused by pulmonary TB, were reported globally. [1]Importantly, pulmonary TB is associated with a high lung injury rate approaching 68%, [2][3][4] with lung injury leading to development of TB-destroyed lung (TDL) disease in 1.30% of cases. [5]TDL is a consequence of improper anti-TB treatment management during the early TB disease stage that causes marked unilateral or bilateral erosion and/or deterioration of lung tissue structure that results in insufficient gas exchange area.As the disease progresses, lung function continues to decline until the normal exchange of inhaled air or oxygen is no longer adequate, resulting in irreversible pulmonary ventilation and ventilation dysfunction [6] that are often accompanied by different degrees of hypoxia, repeated secondary infections, hemoptysis, empyema, and pneumothorax.Notably, the TDL patient mortality rate has been reported to be 28%, [7][8][9][10][11] while KC and YM contributed equally to this work.

This work was supported in part by special funds of Chengdu Science and Technology Bureau (2022-YF05-02139-SN), Chengdu Health Commission (2022101) and Sichuan Provincial Science and Technology Department (23ZDYF1334).
The authors have no conflicts of interest to disclose.
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

The retrospective study was carried out in accordance with the principles of the Declaration of Helsinki. This study was approved by the Medical Ethics Committee of Public Health Clinical Center of Chengdu (YJ-K20220201
) and exempted the study from the informed consent requirement.
the mortality rate of TDL patients requiring mechanical ventilation (TDL-MV) is much higher (61%). [11]Therefore, early identification of risk factors affecting TDL-MV patient prognosis and clinical decision-making with regard to implementation of active interventions are of great importance for improving TDL-MV patient prognosis.
At present, Acute Physiological and Chronic Health Evaluation (APACHE II) and Sequential Organ Failure Assessment (SOFA) scores are often used to assess disease severity of intensive care unit (ICU) patients with TDL-MV.However, these scoring systems do not provide consistent and accurate mortality risk estimates for patients with specific diseases and thus are of limited value when used for this purpose.For example, although APACHE II scoring has been widely used for conducting clinical early prognostic assessments of patients with certain critical diseases, this method requires collection of numerous types of data and thus is complicated, difficult to implement and performs poorly when used within 24 hours of ICU admission. [12]SOFA scoring is also of limited value, since this method is mainly used to assess critically ill patients with sepsis or organ dysfunction. [13,14]Meanwhile, it has been reported that APACHE II and SOFA scoring-based methods cannot be used to accurately predict TDL-MV patient mortality rates. [15]evertheless, in the face of relentless global TB epidemics, increased clinical awareness of TB pulmonary injury has stimulated research to enhance understanding of its impact on patient outcomes, including a small number of studies based on small numbers of patients exploring risk factors affecting TDL-MV patient prognosis. [7,11]In order to build on the results of those limited studies, here we identified clinical characteristics and independent risk factors related to ICU TDL-MV patient mortality and used them to construct a predictive model.The resulting model should help clinicians achieve early prognosis and improved interventional treatments of TDL-MV patients in order to improve treatment outcomes for this patient population.or <6-8 breaths/min, abnormal respiratory rhythm or weak or absent spontaneous respiration; (3) PaO 2 < 50 mm Hg (especially after inhalation of oxygen); (4) progressive increase of arterial pressure of carbon dioxide (PCO 2 ) level and pH decrease; (5) respiratory failure after conventional treatment and/or a trend of worsening respiratory function.

Study design
Inclusion criteria: (1) patients ≥18 years of age; (2) patients meeting both TDL diagnostic criteria and MV therapeutic indications.
Exclusion criteria: (1) younger than 18 years of age; (2) TDL patients who did not require MV; (3) patients with endstage disease, irreversible disease, and patients who could not benefit from ICU treatment; (4) patients who died or were discharged within 24 hours; (5) patients with incomplete clinical data.

Clinical data and laboratory test results
To determine risk factors that are associated with TDL-MV patient mortality, the following data were collected: (1) patient demographic characteristics, including gender, age, race, residence, body mass index (BMI), marriage history, smoking, and alcohol consumption history.(2) TB epidemiological characteristics, including bacillus Calmette-Guérin (BCG) vaccination history and history of TB and/or drug-resistant TB infection.
(3) Complications, including hemoptysis, pneumothorax and septic shock.(4) Comorbidities, including hypertension, diabetes, chronic liver disease, chronic kidney disease, HIV infection, and chronic pulmonary heart disease.( 5) Critical illness score obtained within 24 hours after check-in based on APACHE II or SOFA scoring criteria.(6) Laboratory examination findings obtained within 24 hours after check-in that included blood levels of white blood cells, red blood cells, hemoglobin, platelets, alanine aminotransferase, aspartate aminotransferase, total bilirubin, albumin, blood urea nitrogen (BUN), serum creatinine, pH, PCO 2 , ratio of partial pressure of arterial oxygen to fraction of inspired oxygen (PO 2 /FiO 2 ), lactic acid and levels of CD3 + T cells, CD4 + T cells, and CD8 + T cells.(7) Findings indicating damaged lung.(8) 30-day prognosis (survival vs non-survival).

Construction and validation of prediction model
Patients who provided model training set data were assigned to survival and non-survival groups according to 30-day prognosis results.In order to identify variables with statistical and clinical significance, patient data were screened via univariate analysis (P < .05).Next, potential risk factors affecting TDL-MV patient prognosis were identified using univariate COX regression analysis (P < .1)then independent prognostic risk factors were screened using stepwise multivariate COX regression analysis (P < .05).Thereafter, the prognostic model of TDL-MV patient mortality was generated based on the abovementioned mortality-related independent risk factors and their associated β coefficients.
The effectiveness of the model was internally verified using the training data set via 3 types of analysis: receiver operating characteristic curve analysis (prediction model, APACHE II score, and SOFA score), calibration curve analysis and decision curve analysis (DCA).Next, the validation data set was used to validate the model using the above mentioned analyses then the clinical efficacy of the model for predicting patient mortality was evaluated.Model-based predictions related to TDL-MV patient mortality were next stratified into low-risk and highrisk predictions of mortality according to scores obtained using the training set then Kaplan-Meier survival curves were plotted based on mortality rates obtained from training and validation data sets.

Statistical analysis
Data were analyzed using IBM SPSS Statistics 23.0 statistical data analysis software (IBM, Armonk, NY).Receiver operating characteristic curve and survival curve analyses were conducted using MedCalc (MedCalc Software, Ltd., Ostend, Belgium).Calibration curves and decision curves were drawn using R software (version 4.5.3,https://CRAN.R-project.org,R Foundation, Vienna, Austria).The calibration curves was plotted using the "rms" package.The DCA was performed with the "rmda" package.
The Kolmogorov-Smirnov test was used for normality testing of continuous variables then normally distributed continuous variables were expressed as the mean ± standard deviation.The independent sample t test was used to conduct comparisons between groups.Non-normally distributed continuous variables were expressed as quartile median values and inter quartile range values.Comparisons between groups were conducted using the Mann-Whitney U test.Categorical variables were analyzed using the Chi-square test for continuity correction (χ 2 ).For univariate Cox regression analysis, variables with P values of <.1 were selected for multivariate Cox regression analysis; those with P values of <.05 were considered statistically significant.Survival curves were plotted using the Kaplan-Meier method.

Demographic and clinical characteristics of the study population
Ultimately, 331 TDL-MV patients who met all inclusion and exclusion criteria were included in the study, with demographic and clinical characteristics of all patients shown in Table 1 Of the 331 TDL-MV patients, 28 were afflicted with miliary pulmonary TB with extrapulmonary TB involving the central nervous system TB (n = 13), tuberculous pericarditis (n = 45), TB of the abdominal cavity (n = 35), and urinary TB (n = 7).No statistical differences in baseline characteristics were observed between the training set and the validation set except for BCG vaccination history, chronic kidney disease history, chronic pulmonary heart disease, and blood lactic acid level (more detailed information is presented in Table 1).

Construction of the mortality prediction model
Univariate analysis of training set data revealed 13 variables with statistically significant associations with patient mortality , including patient age, BMI, septic shock, and blood levels of platelets, serum albumin, urea nitrogen, serum creatinine, lactic acid, arterial blood pH, PO 2 /FiO 2 , and blood levels of CD3 + T cells, CD4 + T cells, and CD8 + T cells (P < .05)(Table 2).Of these variables, those significantly associated with greater mortality (P values of <0.1) were identified using univariate COX regression analysis (detailed information presented in Table 3), including BMI, septic shock, PO 2 /FiO 2 , levels of serum albumin, BUN and serum creatinine, and blood levels of CD8 + T cells.Ultimately, variables with P values of <.5, including BMI, septic shock, and BUN (Table 4), were considered independent risk factors for increased TDL-MV patient mortality as based on regression β values (Table 5).Variables with β values of <1 received 1 point and those with β values of >1 received 2 points; independent variable scores for each patient were added together to obtain model-based mortality risk prediction scores that the minimum was 0 and the maximum was 4 points.

Validation of the mortality prediction model
Using the training data set to test the risk prediction model revealed an area under curve (AUC) value for 30-day TDL-MV patient mortality of 0.808, a sensitivity rate of 79.1%, and a specificity rate of 68.80%.Meanwhile, the SOFA score AUC value was 0.702, the sensitivity rate was 49.17%, and the specificity rate was 83.20%, an APACHE II score-based AUC value of 0.696, sensitivity rate of 50.00% and specificity rate of 77.60% (Fig. 2A).
Use of the validation data set to validate the mortality risk prediction model revealed an AUC value for 30-day TDL-MV patient mortality of 0.876, a sensitivity rate of 95.12% and a specificity rate of 62.22%.Meanwhile, the SOFA score-based AUC value was 0.738, the sensitivity rate was 53.66% and the specificity rate was 84.44%, while the APACHE II score-based AUC value was 0.803, the sensitivity rate was 82.93% and the specificity rate was 66.67% (Fig. 3A).Notably, predicted mortality rates obtained using the risk prediction model for both the training and validation data sets were close to the actual observed TDL-MV patient mortality rate, thus indicating good calibration curve agreement (Figs.2B  and 3B).Moreover, results obtained via DCA also showed that the prediction model provided a good net benefit (Figs.2C and  3C).

Prognostic comparison and risk stratification
Analysis of training data set-derived patient scores generated using the risk prediction model revealed an increase in TDL-MV patient mortality rate with increasing score.With regard to mortality risk, comparisons of patient scores to a critical threshold value resulted in the assignment of 44 patients to the high-risk group (3-7 points) and 201 patients to the low-risk group (0-2 points) (Table 6).An analysis of Kaplan-Meier curves revealed significant differences in survival between high-risk and lowrisk patient groups as stratified using the risk prediction model developed using both training and validation data sets (Figs.2D  and 3D).

Discussion
In this study, we constructed and validated a predictive model of mortality of TDL-MV patients.Internal verification of the risk prediction model indicated good discrimination and prediction performance of the model, thus demonstrating that use of the risk prediction model would not only enable more efficient utilization of limited medical resources, but would also reduce the TDL-MV patient mortality rate.22] Septic shock is likely induced capillary endothelial injury, tissue microcirculation ischemia, and hypoxia.In turn, these effects can lead to vascular paralysis, organ function impairment, and severe microcirculatory dysfunction that can persist even after normalization of hemodynamics has been achieved through fluid resuscitation and administration of vasopressors and anti-infective therapies. [23][26] The results of a retrospective nested cohort study reported in 2013 indicated that although clinical manifestations of M tuberculosis infection-induced septic shock resembled those associated with other bacterial infections, the mortality rate of M tuberculosis infection-induced septic shock (79.2%) was significantly greater than corresponding rates obtained for all other types of bacterial infections (49.7%). [27]Indeed, in the current study the mortality rate of ICU TDL-MV patients with septic shock was markedly higher (~92.3%), a result that may reflect the fact that our patients were late-stage TB patients and thus had more severe TB disease than patients in the above mentioned studies.[30][31] A retrospective observational study reported in 2021 involving 845 patients with acute exacerbations of chronic obstructive pulmonary disease which revealed that the elevated BUN level was associated with increased acute exacerbations of chronic obstructive pulmonary disease patient mortality. [32]Meanwhile, results of an other retrospective study involving 2682 ICU patients with intracranial hemorrhage indicated that an elevated BUN level was significantly associated with poor prognosis. [33]In addition, results of other studies have confirmed that an elevated BUN level is associated with increased long-term mortality of severely ill patients. [34]imilarly, in our study of TDL-MV patients, the average BUN level of the patient group with greater risk of death exceeded that of the patient group with greater survival.Moreover, an increased BUN level was found to be an independent risk factor for increased TDL-MV patient mortality.The underlying mechanism needs further study.
In recent years, the mortality prediction of ICU patients has been widely investigated, but the general ICU severity predictive model were not sufficient for predicting mortality in the population of TDL-MV patients accurately and reliably.In our study, analysis of risk prediction performance of the model using calibration curve and decision curves generated revealed that the model had excellent prediction ability and predicted patient mortality more accurately than did predictive methods based on traditional APACHE II and SOFA scores.Meanwhile, the prediction model we have developed has significant advantages, and its index design is simple and clear, easy to obtain, which greatly facilitates the clinical operation.This model shows important application value in clinical practice.First, it can help us to identify patients who are at high risk of death early, so as to provide them with timely medical intervention.Secondly, the model can quantify the scores of critically ill patients, provide objective reference for doctors, and make individualized treatment for patients with high risk of death.In addition, the model can allocate limited medical resources reasonably and effectively  based on the severity of patients to ensure the maximum use of resources.
Our study had several limitations.First, it was a singlecenter, retrospective, observational cohort study based in western China and thus our patient population may not be representative of TDL-MV patients in other regions or countries.Therefore, multi-center, prospective studies of TDL-MV patient prognostic risk factors are needed to externally validate the model.Second, the economic health of western China is relatively poor, especially in rural areas.As a result, some TDL patients resource-poor regions who need mechanical ventilation do not have access to ICUs and thus are under represented in ICU patient populations in such regions, thus leading to bias.Third, the prediction model was constructed using a retrospective design that would inevitably introduce bias into the results to thereby reduce the statistical strength of our conclusions.Fourth, although laboratory data may change as TB progresses, such changes are not adequately captured in retrospective study-based predictive models, since such studies cannot incorporate dynamic changes of indicators in the final analysis.

Conclusion
In this study, a mortality prediction model based on independent risk factors of BMI < 18.5 kg/m 2 , septic shock, and BUN ≥ 7.14 mmol/L was found to have good predictive  power and differentiation ability.Moreover, internal validation of the model using retrospective data of adult TDL-MV patients admitted to the ICU of Chengdu Public Health Clinical Medical Center revealed that results obtained using the model were highly stable, reliable and reproducible.Thus, this model may be clinically useful for achieving early and rapid screening of patients with TDL who need mechanical ventilation and thus are at high risk of death.

Figure 1 .
Figure 1.Flow chart of participants included in this analysis.

Figure 2 .
Figure 2. Tranining set.APACHE II, Acute Physiology And Chronic Health Evaluation II; SOFA, Sequential Organ Failure Assessment; DCA, decision curve analysis.(A) Prediction model and other clinical scores receiver operating characteristic (ROC) curve.(B) Calibration curves for prediction model (B = 100 repetitions, boot mean absolute error = 0.014; n = 245).The horizontal axis was the predicted probability of 30-days death by the model, and the vertical axis was the actual probability.The dashed line indicates the predicted probability completely fits the actual probability.(C) Decision curve analysis for prediction model.The y-axis measures the net benefit.The red line represents the prediction model.The gray dotted line represents the assumption that all patients dead.The black dotted line represents the assumption that no patients dead.DCA: decision curve analysis.(D) Kaplan-Meier survival curves of the patients with tuberculosis-destroyed lung requiring mechanical ventilation training set.

Figure 3 .
Figure 3. Validation set.APACHE II, Acute Physiology And Chronic Health Evaluation II; SOFA, Sequential Organ Failure Assessment; DCA, decision curve analysis.(A) Prediction model and other clinical scores receiver operating characteristic (ROC) curve.(B) Calibration curves for prediction model (B = 100 repetitions, boot mean absolute error = 0.023; n = 86).The horizontal axis was the predicted probability of 30-days death by the model, and the vertical axis was the actual probability.The dashed line indicates the predicted probability completely fits the actual probability.(C) Decision curve analysis for prediction model.The y-axis measures the net benefit.The red line represents the prediction model.The gray dotted line represents the assumption that all patients dead.The black dotted line represents the assumption that no patients dead.DCA: decision curve analysis.(D) Kaplan-Meier survival curves of the patients with tuberculosis-destroyed lung requiring mechanical ventilation in validation set.

Table 1
Baseline characteristics of the study population.The continuous variables of normal distribution are expressed as mean ± standard deviation, the non-normal continuous variables are expressed as median (25th, 75th range), and the categorical variables are expressed as number (%).APACHE II = Acute Physiology And Chronic Health Evaluation II, BMI = body mass index, SOFA = Sequential Organ Failure Assessment. Note:

Table 2
Baseline characteristics of the study population in the training set.

Table 3
Assignment table.

Table 4
Univariate and multivariate Cox regression analysis of clinical factors in the training set.

Table 5
Construction of a risk prediction model for death in patients with TDL-MV.
BMI = body mass index.

Table 6
Mortality rates of the TDL-MV patients according to the risk prediction model.